Abstract



The essential part of abstract interpretation is to build a machine-representable abstract domain 
expressing interesting properties about the possible states reached by a program at run-time. 
Many techniques have been developed which assume that one knows in advance the class of 
properties that are of interest. There are cases however when there are no a priori indications 
about the "best" abstract properties to use. We introduce a new framework that allows non-unique
representations of abstract program properties to be used, and present a method, called
dynamic partitioning, that allows the dynamic determination of interesting abstract domains
using data structures built over simpler domains. Finally, we show how dynamic partitioning
can be used to compute non-trivial approximations of functions over infinite domains and give 
an application to the computation of minimal function graphs. 



Résumé

One of the main difficulties of abstract interpretation is to build a machine-representable
abstract domain that can express a set of properties sufficient to describe precisely the set of
states a program can be in when it is executed. Many abstract interpretation techniques have been
developed on the assumption that the class of "good" properties is well identified from the outset.
In many cases, however, there is no a priori indication of the relative interest of the various
classes of properties that could be considered. We present here a new method, called dynamic
partitioning, which allows the "good" properties to be determined dynamically by using data
structures built from simple approximations of the concrete domain. We show in particular how
finite, non-trivial approximations of functions over infinite domains can be computed automatically,
and we give an application to the computation of minimal function graphs.

1 Introduction

Abstract interpretation, as defined in Cousot [2, 4, 6], provides a general framework aimed
at computing invariance properties of programs. These properties describe the run-time states
that can be reached from a set of initial states $P_0$ by means of a transition function $\tau$ over
the subsets of the set of run-time states $S$, which defines the operational semantics of a given
program. When $\tau$ is continuous over the lattice $(\mathcal{P}(S), \subseteq)$, the invariant
$I = \bigcup_{n \in \mathbb{N}} \tau^n(P_0)$ is also the least fixed point $\mathrm{lfp}(\Phi)$ of the
function $\Phi = \lambda P.(P_0 \cup \tau(P))$. In most cases however,
$S$ is infinite, and methods must be developed to determine a safe and finitely represented
approximation of this least fixed point. Patrick and Radhia Cousot introduced the notion of
Galois connection $(\alpha, \gamma)$ as a general tool for building such approximate frameworks. The
abstraction function $\alpha$ maps a set of states $P$ to an element $P^*$ of a finitely represented abstract
(approximate) lattice $(\mathcal{P}^*(S), \sqsubseteq)$ whereas the concretization function $\gamma$ maps an abstract state
$P^*$ to a set of states, called its meaning. Then, by defining a safe approximation $\Phi^*$ of $\Phi$, i.e.,
a function such that $\Phi^* \sqsupseteq \alpha \circ \Phi \circ \gamma$, one can determine an approximate invariant $I^* = \mathrm{lfp}(\Phi^*)$
which is a safe approximation of $I$, i.e., $\gamma(I^*) \supseteq I$.
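
As an illustration of the exact (non-approximated) fixed point mentioned above, the following minimal Python sketch iterates the function that maps P to the union of the initial states and the successors of P, starting from the empty set. The function name `reachable`, the successor function `step`, and the bounded counter used as a transition system are assumptions made for this example only; the iteration terminates only because the reachable set is finite here.

```python
def reachable(initial, step):
    """Least fixed point of Phi(P) = initial | tau(P), by Kleene iteration.

    `step` maps one state to the set of its successors; tau lifts it to sets.
    Terminates only when the set of reachable states is finite.
    """
    current = frozenset()                       # bottom of (P(S), subset)
    while True:
        successors = {t for s in current for t in step(s)}
        following = frozenset(initial) | frozenset(successors)
        if following == current:                # Phi(P) == P: fixed point reached
            return current
        current = following


# Hypothetical transition system: a counter that stops at 5.
invariant = reachable({0}, lambda s: {s + 1} if s < 5 else set())
```

When the set of states is infinite this loop may never stop, which is the situation the approximate frameworks discussed here are designed to handle.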

However, when the approximate lattice is of infinite height, the iterative computation of the 
approximate invariant may not finitely converge, and speedup techniques, such as widening 
and narrowing, must be used to determine a safe approximation of I*. But in many cases, there 
is not even a clear indication about how to build a good and finitely represented abstract lattice. 
This happens when there is no indication about what the least fixed point will "look like", 
and therefore no advance knowledge of the properties that should be expressed in the abstract 
lattice $\mathcal{P}^*(S)$ to precisely describe the invariant $I$. Differently stated, there is a gap between
the exact lattice $\mathcal{P}(S)$ and the abstract lattices that one can a priori build or think about.
This is true in particular when functions over infinite domains are approximated. Moreover, 
most implementations of abstract interpretation have to deal with the problem of testing the 
equivalence of the data structures used to represent lattice elements (i.e., testing the equality 
of their meaning). This test is often very costly and difficult to implement, as with the abstract 
interpretation of functional or logic programs for instance, and it would be desirable to design 
a framework that avoids such a test. 

The aim of this paper is to discuss a technique, which we call dynamic partitioning, that 
can be used to compute non-trivial, safe approximations of program invariants in the above 
cases, by dynamically selecting interesting and finitely represented abstract properties without 
having to test the equality of their meaning. 

We shall first recall, in section 2, the classical definition of a widening operator, and then 
describe, in section 3, a framework that generalizes the classical lattice-oriented framework 
to cases where no Galois connection can be defined and where testing the equivalence of 
data structures can be avoided by using properly generalized widening operators over general 
partial orders. We shall then discuss in section 4 several classical situations in interprocedural 
abstract interpretation and show how our technique can be applied in each case. We will show 
in particular how non-trivial approximations of functional fixed points can be computed by 
using our framework and an adequate data structure. Finally, we shall give in section 5 two 
practical applications and show how dynamic partitioning can be used to effectively compute
non-trivial, safe approximations of minimal function graphs. 

2 Widening operators 

The Galois connection approach described above is perfectly adequate when the abstract
lattice $\mathcal{P}^*(S)$ is of finite height, since the iterative computation of the least fixed point of
$\Phi^*$ over $\mathcal{P}^*(S)$ will necessarily converge in a finite amount of time. However, some interesting
program properties, such as the range of integer variables, are expressed in a lattice of infinite
height. Even if the integer variables were bounded, choosing for instance $Z = \{\omega^-, \ldots, \omega^+\}$,
$\omega^- = -2^{b-1}$, $\omega^+ = 2^{b-1} - 1$, the interval lattice $I(Z)$ is still of height $2^b$, and fixed point
computations may in theory require up to the same number of iterations. A speedup technique
has been proposed in Cousot [2] that uses so-called widening operators to transform infinite
iterative computations into finite but approximated ones. So let us suppose for instance that
one has a program function Loop defined in ML as follows:

    fun Loop i = if i < 100 then
                     Loop (i + 1)
                 else
                     i

Suppose now that one wants to compute the values returned by Loop for a set of input data
specifications. Loop being recursive, this computation may require computing the value
returned by Loop for arguments that were not present in the initial data set. Therefore, the
goal of an interprocedural abstract interpretation framework will be to determine this minimal
function graph describing the minimal information about Loop needed to compute its value
for every argument in the original specification. This notion of minimal function graph was
first introduced in Jones and Mycroft [10], but was in essence already present in Cousot [5].
A program state $(a, b) \in Z \times Z_\bot$ therefore consists of an input value $a$ and a return value
$b = \mathrm{Loop}(a)$, where the special value $\bot$ denotes nontermination. Generalizing the idea used
by Jones and Mycroft for constant propagation, we can approximate the minimal function
graph of Loop by a pair of intervals representing an approximation of all of Loop's arguments
and all of Loop's results. This approximate minimal function graph is therefore the least fixed
point $X^*$ of the monotonic function $\Phi^*$ over the lattice $I(Z) \times I(Z)$ defined as follows:

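$$\Phi^*(i, v) = (i_0, \bot) \vee (\Phi_1^*(i), \Phi_2^*(i, v))$$

where $i_0$ is the input data specification, and the two functions $\Phi_1^*$ and $\Phi_2^*$ are defined by:

$$\Phi_1^*(i) = \mathrm{incr}^*(i \wedge [\omega^-, 99])$$
$$\Phi_2^*(i, v) = v \vee (i \wedge [100, \omega^+])$$

where the strict abstract function $\mathrm{incr}^*$ is defined by:

$$\mathrm{incr}^*(\bot) = \bot$$
$$\mathrm{incr}^*[a, b] = [\min(a + 1, \omega^+), \min(b + 1, \omega^+)]$$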
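
The abstract function for Loop can be sketched in Python as follows. This is only an illustration under stated assumptions: intervals are encoded as `(lo, hi)` pairs, the bottom interval as `None`, and the bounds are taken to be those of 32-bit integers; the helper names (`join`, `meet`, `incr`, `phi`, `kleene`) are hypothetical and not taken from the paper.

```python
W_MIN, W_MAX = -2**31, 2**31 - 1     # assumed values for the two integer bounds
BOT = None                            # bottom interval


def join(x, y):
    """Least upper bound of two intervals."""
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))


def meet(x, y):
    """Greatest lower bound of two intervals (bottom when they do not overlap)."""
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT


def incr(x):
    """Strict abstract increment, saturating at the upper bound."""
    if x is BOT:
        return BOT
    return (min(x[0] + 1, W_MAX), min(x[1] + 1, W_MAX))


def phi(state, i0=(0, 0)):
    """Abstract function for Loop on pairs (arguments, results)."""
    i, v = state
    return (join(i0, incr(meet(i, (W_MIN, 99)))),
            join(v, meet(i, (100, W_MAX))))


def kleene(f, start):
    """Iterate f from start until a fixed point is reached; count the steps."""
    x, steps = start, 0
    while (nxt := f(x)) != x:
        x, steps = nxt, steps + 1
    return x, steps


fixed_point, steps = kleene(phi, (BOT, BOT))
```

With the input specification [0, 0], this plain iteration ends on ([0, 100], [100, 100]) after 102 applications of `phi`, which is the slow convergence that motivates widening.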
The least fixed point $X^*$ is known to be the upper limit of the increasing chain:

$$X_0^* = (\bot, \bot), \qquad X_{n+1}^* = \Phi^*(X_n^*)$$

But this limit can take a very long time to compute. For instance, if $i_0 = [0, 0]$, this least fixed
point is equal to $([0, 100], [100, 100])$ and is reached after 102 iterations. One must therefore
find a way to speed up this computation in order for it to be tractable. To this end, we introduce
the widening operator $\nabla_I$ over $I(Z)$, taken from Cousot [2] p. 247, or [6] p. 334:

$$\bot \nabla_I x = x \nabla_I \bot = x$$
$$[a_1, b_1] \nabla_I [a_2, b_2] = [\text{if } a_2 < a_1 \text{ then } \omega^- \text{ else } a_1,\ \text{if } b_2 > b_1 \text{ then } \omega^+ \text{ else } b_1]$$

This non-commutative operator generalizes "unstable" bounds of its right argument. It is a
safe approximation of the join operator, and is such that for every increasing chain $(x_n)_{n \in \mathbb{N}}$,
the increasing chain $(y_n)_{n \in \mathbb{N}}$ defined by:

$$y_0 = x_0, \qquad y_{n+1} = y_n \nabla_I x_{n+1}$$

is always eventually stable, i.e., there exists an $n_0$ such that $\forall n \geq n_0 : y_n = y_{n_0}$. Under
these assumptions, it is well known ([6], theorem 10-30, p. 334) that the upper limit $Y^*$ of the
increasing chain:

$$Y_0^* = \bot, \qquad Y_{n+1}^* = \begin{cases} Y_n^* & \text{if } \Phi^*(Y_n^*) \leq Y_n^* \\ Y_n^* \nabla_I \Phi^*(Y_n^*) & \text{otherwise} \end{cases}$$

where the widening operator is applied componentwise, is a post fixed point of $\Phi^*$, i.e.,
$\Phi^*(Y^*) \leq Y^*$, and, therefore, is a safe approximation of $X^*$, i.e., $X^* \leq Y^*$. Note that since
here:

$$y \leq x \implies x \nabla_I y = x$$

this chain could be simply defined by:

$$Y_{n+1}^* = Y_n^* \nabla_I \Phi^*(Y_n^*)$$

So for example, with input data specification $i_0 = [0, 0]$, one can compute the increasing
chain:

$$Y_0^* = (\bot, \bot)$$
$$Y_1^* = ([0, 0], \bot)$$
$$Y_2^* = ([0, \omega^+], \bot)$$
$$Y_3^* = Y_4^* = ([0, \omega^+], [100, \omega^+])$$

whose limit $Y^*$ is a safe approximation of the least fixed point:

$$X^* = ([0, 100], [100, 100])$$
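
The widened iteration for the Loop example can be replayed with the following Python sketch. It uses the simplified form of the chain (each iterate is widened by its image), and it assumes the same hypothetical encoding as before: `(lo, hi)` pairs for intervals, `None` for bottom, and 32-bit values for the two bounds.

```python
W_MIN, W_MAX = -2**31, 2**31 - 1     # assumed values for the two integer bounds
BOT = None                            # bottom interval


def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))


def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT


def incr(x):
    return BOT if x is BOT else (min(x[0] + 1, W_MAX), min(x[1] + 1, W_MAX))


def phi(state, i0=(0, 0)):
    i, v = state
    return (join(i0, incr(meet(i, (W_MIN, 99)))),
            join(v, meet(i, (100, W_MAX))))


def widen(x, y):
    """Interval widening: unstable bounds of y jump to the extreme bounds."""
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (W_MIN if y[0] < x[0] else x[0],
            W_MAX if y[1] > x[1] else x[1])


def widening_chain(f, start):
    """Iterates obtained by widening each iterate with its image, componentwise."""
    chain = [start]
    while True:
        y = chain[-1]
        nxt = tuple(widen(a, b) for a, b in zip(y, f(y)))
        if nxt == y:                  # stabilisation detected by plain equality
            return chain
        chain.append(nxt)


chain = widening_chain(phi, (BOT, BOT))
```

Under these assumptions the chain has four distinct elements and stops on ([0, W_MAX], [100, W_MAX]), instead of the 102 steps needed by the plain iteration.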
This result could also be improved by using the narrowing operator $\Delta_I$ defined by:

$$\bot \Delta_I x = x \Delta_I \bot = \bot$$
$$[a_1, b_1] \Delta_I [a_2, b_2] = [\text{if } a_1 = \omega^- \text{ then } a_2 \text{ else } \min(a_1, a_2),\ \text{if } b_1 = \omega^+ \text{ then } b_2 \text{ else } \max(b_1, b_2)]$$

This operator satisfies the canonical condition:

$$\forall x, y \in I(Z) : x \geq y \implies x \geq x \Delta_I y \geq y$$

and is such that for every decreasing chain $(y_n)_{n \in \mathbb{N}}$, the chain $(z_n)_{n \in \mathbb{N}}$ defined by:

$$z_0 = y_0, \qquad z_{n+1} = z_n \Delta_I y_{n+1}$$

is always eventually stable. It is known that the lower limit of the decreasing chain:

$$Z_0^* = Y^*, \qquad Z_{n+1}^* = Z_n^* \Delta_I \Phi^*(Z_n^*)$$

starting from the post fixed point $Y^*$, is a safe approximation of $X^*$. On our example, this
gives, after only 2 iterations:

$$Z_0^* = ([0, \omega^+], [100, \omega^+])$$
$$Z_1^* = Z_2^* = ([0, 100], [100, \omega^+])$$
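
The narrowing phase of the Loop example can be sketched in the same hypothetical Python encoding (`(lo, hi)` pairs, `None` for bottom, assumed 32-bit bounds). The starting point is hard-coded to the post fixed point given in the text, so the block does not depend on any other example.

```python
W_MIN, W_MAX = -2**31, 2**31 - 1     # assumed values for the two integer bounds
BOT = None                            # bottom interval


def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))


def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT


def incr(x):
    return BOT if x is BOT else (min(x[0] + 1, W_MAX), min(x[1] + 1, W_MAX))


def phi(state, i0=(0, 0)):
    i, v = state
    return (join(i0, incr(meet(i, (W_MIN, 99)))),
            join(v, meet(i, (100, W_MAX))))


def narrow(x, y):
    """Interval narrowing: only bounds sitting at an extreme value are refined."""
    if x is BOT or y is BOT:
        return BOT
    lo = y[0] if x[0] == W_MIN else min(x[0], y[0])
    hi = y[1] if x[1] == W_MAX else max(x[1], y[1])
    return (lo, hi)


def narrowing_limit(f, start):
    """Narrow each iterate by its image until the chain is stable."""
    z = start
    while True:
        nxt = tuple(narrow(a, b) for a, b in zip(z, f(z)))
        if nxt == z:
            return z
        z = nxt


post_fixed_point = ((0, W_MAX), (100, W_MAX))   # result of the widening phase
refined = narrowing_limit(phi, post_fixed_point)
```

The argument interval is recovered exactly as [0, 100], while the result interval keeps its widened upper bound, as in the chain shown above.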

This example shows how good results can be obtained by using very naive and "brute force" 
widening and narrowing operators. Of course, it might be argued that the interval [0, 100] 
inferred by the computation could have been easily determined by simply looking at the text of 
the program, and that a finite abstract lattice could thus have been built a priori. We shall see, 
in section 5.2, an example that shows that this is not always the case, and in practice, building 
ad-hoc approximate functional lattices is simply not feasible. However, since we already know 
how to deal with intervals, it is very tempting to describe minimal function graphs by sets of 
interval pairs, instead of a single pair of intervals. The advantage of such a representation 
is that it is very flexible, and does not establish in advance any particular tradeoff between 
complexity and precision. 

There is however a difficult problem to solve if we want to use this approach, in that there 
is no canonical representation of abstract minimal function graphs. For example, it is quite 
clear that the two sets {([1, 2], [0, 0])} and {([1, 1], [0, 0]), ([2, 2], [0, 0])} are equivalent in 
the sense that they represent the same minimal function graph, but neither is "better" than the
other. What we would like to do, however, is to work with such representations, even though
they are not unique, and still ensure the convergence and safeness of every computation. The 
traditional complete lattice framework is clearly not appropriate in this case, so we need to 
generalize the approach to general partial orders. 
3 Representations

Definition 1 Let $(D, \bot, \sqsubseteq)$ be a countable complete partial order (cpo), $R$ a set, and
$\gamma : R \to D$ a meaning function. Then $(R, \preceq, \gamma, \nabla)$ is said to be a representation of $D$ if:

i) $(R, \bot, \preceq)$ is a partial order
ii) The meaning function $\gamma$ is monotonic
iii) Each element $d \in D$ can be safely abstracted by an element $\alpha(d) \in R$, i.e.:
$$\gamma(\alpha(d)) \sqsupseteq d$$
iv) Each binary operator $\nabla_i : R \times R \to R$ of the sequence $\nabla = (\nabla_i)_{i \in \mathbb{N}}$ is such that:
$$\forall r, r' \in R : \quad r \preceq r \nabla_i r' \quad \text{and} \quad \gamma(r') \sqsubseteq \gamma(r \nabla_i r')$$
v) For every $(r_i)_{i \in \mathbb{N}} \in R^{\mathbb{N}}$, the chain $(r'_i)_{i \in \mathbb{N}}$ defined by:
$$r'_0 = r_0, \qquad r'_{i+1} = r'_i \nabla_i r_{i+1}$$
has an upper bound.

A representation is said to be complete if $(R, \preceq)$ is a cpo, finite if every element of $R$ has a
finite encoding, and tractable if the chain $(r'_i)_{i \in \mathbb{N}}$ is always eventually stable.



This definition has some similarities with the definition of the upper approximation $(D^*, \leq)$ of
a complete lattice $(D, \sqsubseteq)$ using a Galois connection, i.e., a pair of monotonic functions $(\alpha, \gamma)$,
$\alpha : D \to D^*$ and $\gamma : D^* \to D$ such that:

$$\forall (d, d^*) \in D \times D^* : \alpha(d) \leq d^* \iff d \sqsubseteq \gamma(d^*)$$

The difference between the two definitions is that our framework makes very weak assumptions
about $R$ and $D$ and generalizes the case where $R$ and $D$ are both complete lattices, since
we only require that $\alpha$ be safe. As a matter of fact, $\alpha$ is not even needed in the framework,
and only the existence of a safe approximation for every concrete element is required. This
allows in particular different representations to have the same meaning and one can choose
arbitrarily between them, hence the term representation. Therefore, the traditional inequality
$\alpha \circ \gamma \leq \mathrm{Id}_R$, which becomes an equality when $\gamma$ is one-to-one, does not hold in this framework.

Each elementary widening operator $\nabla_i$ of the widening operator $\nabla = (\nabla_i)_{i \in \mathbb{N}}$ is an
alternative to a join operator over the abstract lattice which does not necessarily exist if $R$
is a partial order. The conditions imposed on $\nabla_i$ simply ensure that safe and increasing chains
over $R$ can be built from increasing chains over $D$, as we shall see below. When $R$ and $D$
are both complete lattices, the operator $\nabla$ of a tractable representation is a straightforward
generalization of the classical widening operator, in the sense that $\gamma(r \nabla_i r') \sqsupseteq \gamma(r) \sqcup \gamma(r')$ and
every increasing chain built using $\nabla$ is eventually stable.

Another remark concerning this framework is that condition (v) is trivially satisfied for any
complete representation, for every increasing chain has a least upper bound. Differently stated,
the widening operator $\nabla$ is not necessary in a complete representation to prove the existence
of a least upper approximation. But even so, it can be very interesting to have such a widening
operator to define finite and tractable frameworks, as we have seen in section 2. Also, note
that the use of widening operators over non-complete lattices was already present in Cousot
and Halbwachs [3], where the lattice of finitely represented convex hulls is not complete.

Finally, it should be noted that our framework can be very easily generalized to cases where
$(R, \preceq)$ is only a preorder¹, in which case the meaning function need not be monotonic and the
conditions imposed on the elementary widening operators must be:

$$\gamma(r) \sqsubseteq \gamma(r \nabla_i r')$$
$$\gamma(r') \sqsubseteq \gamma(r \nabla_i r')$$

Preorders have been used for instance in Stransky [12]. However, the problem with such
very general frameworks is that not much can be said about them, for they are essentially
defined by the properties of the widening operator $\nabla$. Moreover, preorders are not very
easy to work with, for representations having the same meaning can "oscillate" during the
computation and one must be able to finitely compute the equivalence of representations (i.e.,
$\gamma(r) = \gamma(r')$) to detect the stabilization of increasing chains. As we noted earlier, this is not
necessary here since stabilization is detected by the equality of representations (i.e., $r = r'$).
Hence, our framework is perfectly suited to cases where $R$ is implemented using very complex
data structures for which the equivalence test is intractable or very costly, since we require
that equivalent representations be comparable only when they are "similar enough". It is
interesting to note that a comparable idea, which was only a heuristic at the time, was used in
the design of the widening operator of Cousot and Halbwachs [3] which preserves as much as
possible the representations of convex hulls during iterative computations.

So let $(R, \preceq, \gamma, \nabla)$ be a representation of $D$, and $\Phi \in D \to D$ be a continuous function,
that is, a monotonic function such that for every directed subset $C \subseteq D$: $\Phi(\bigsqcup C) = \bigsqcup \Phi(C)$.
It is well known that the least fixed point of $\Phi$ is:

$$\phi = \mathrm{lfp}(\Phi) = \bigsqcup_{i \in \mathbb{N}} \Phi^i(\bot)$$

If the elements of $D$ do not have a finite encoding, it may be impossible to compute this
increasing chain. So let us suppose that one can define a safe approximation of $\Phi$ operating
over the set of (supposedly finitely encoded) representations $R$, that is, a function $\Phi^*$ such
that:

$$\gamma \circ \Phi^* \sqsupseteq \Phi \circ \gamma$$

¹ A preorder is a reflexive and transitive binary relation.
One can choose in particular² any function $\Phi^*$ such that $\Phi^* \succeq \alpha \circ \Phi \circ \gamma$, for by monotonicity
of $\gamma$ and property (iii):

$$\gamma \circ \Phi^* \sqsupseteq (\gamma \circ \alpha) \circ \Phi \circ \gamma \sqsupseteq \Phi \circ \gamma$$

but this is not mandatory. The only thing we really need is any safe approximation of $\Phi$. Our
problem is now: how can we compute a safe representation of $\phi$ using $\Phi^*$, which is the only
function that we can possibly compute? To this end, let us define the chain $(r_i)_{i \in \mathbb{N}}$ by:

$$r_0 = \bot, \qquad r_{i+1} = r_i \nabla_i \Phi^*(r_i)$$

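Theorem 2 The chain $(r_i)_{i \in \mathbb{N}}$ is an increasing chain over $R$ that has an upper bound $r_\omega$
which is a safe representation of the least fixed point of $\Phi$, that is:

$$\gamma(r_\omega) \sqsupseteq \phi$$

Proof. Let us first define the increasing chain $(\phi_i)_{i \in \mathbb{N}}$ by $\phi_0 = \bot$ and $\phi_{i+1} = \Phi(\phi_i)$. It is
clear that $\gamma(r_0) \sqsupseteq \bot = \phi_0$. Suppose by induction that $\gamma(r_i) \sqsupseteq \phi_i$. Then by definition of the
widening operator $\nabla$:

$$\gamma(r_{i+1}) = \gamma(r_i \nabla_i \Phi^*(r_i)) \sqsupseteq \gamma(\Phi^*(r_i))$$

But $\Phi^*$ being a safe approximation of $\Phi$, and by monotonicity of $\Phi$:

$$\gamma(\Phi^*(r_i)) \sqsupseteq \Phi(\gamma(r_i)) \sqsupseteq \Phi(\phi_i) = \phi_{i+1}$$

which proves that each $r_i$ is a safe approximation of $\phi_i$. But thanks to the first property of
$\nabla$, $r_{i+1} \succeq r_i$, and $(r_i)_{i \in \mathbb{N}}$ is an increasing chain which has an upper bound $r_\omega$. Finally, by
monotonicity of $\gamma$: $\forall i : \phi_i \sqsubseteq \gamma(r_i) \sqsubseteq \gamma(r_\omega)$, which implies that $\phi \sqsubseteq \gamma(r_\omega)$. ■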



Using a finite and tractable representation, one can therefore compute a non-trivial, safe and
finitely represented approximation of the least fixed point $\phi$ of any continuous function $\Phi$ over
$D$, even if the representation $R$ does not have a maximum element (which would be of course
a trivial safe approximation of any element of $D$).
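
The iteration used in Theorem 2 can be written generically. The Python sketch below is only an illustration: the function `iterate_with_widening`, its `widenings(i)` interface returning the i-th elementary operator, the `max_steps` guard, and the toy representation (finite sets of integers plus a hypothetical `TOP` value) are all assumptions of this example. The point it shows is that stabilisation is detected by equality of representations, never by comparing meanings.

```python
def iterate_with_widening(phi_star, widenings, bottom, max_steps=10_000):
    """Compute r_{i+1} = widenings(i)(r_i, phi_star(r_i)) until r_{i+1} == r_i."""
    r = bottom
    for i in range(max_steps):
        nxt = widenings(i)(r, phi_star(r))
        if nxt == r:                   # equality of representations only
            return r
        r = nxt
    raise RuntimeError("chain did not stabilise; the representation may not be tractable")


TOP = "TOP"   # hypothetical greatest representation, meaning "all integers"


def phi_star(r):
    """Toy abstract function whose exact iterates grow forever: {0}, {0,1}, ..."""
    if r == TOP:
        return TOP
    return frozenset({0}) | frozenset(x + 1 for x in r)


def widenings(i):
    """Join-like operators for the first three steps, then a generalisation."""
    def union(r, s):
        return TOP if TOP in (r, s) else r | s

    def to_top(r, s):
        return r if s != TOP and r != TOP and s <= r else TOP

    return union if i < 3 else to_top


result = iterate_with_widening(phi_star, widenings, frozenset())
```

Here the first three steps accumulate {0, 1, 2} exactly and the fourth generalises to `TOP`, so the chain is stable after a handful of iterations even though the exact iterates never converge.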

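However, as in section 2, it is possible to compute an even better representation $r'_\omega$ of
the least fixed point of $\Phi$ by defining a narrowing operator $\Delta = (\Delta_i)_{i \in \mathbb{N}}$ such that every
elementary narrowing operator $\Delta_i$ satisfies:

$$\forall r, r' \in R,\ \forall \phi \in D : \quad \phi \sqsubseteq \gamma(r),\ \gamma(r') \implies \begin{cases} r \Delta_i r' \preceq r \\ \phi \sqsubseteq \gamma(r \Delta_i r') \end{cases}$$

² As in Cousot [6] p. 331 for instance.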
and computing the lower limit of the decreasing chain $(r'_i)_{i \in \mathbb{N}}$ defined as follows:

$$r'_0 = r_\omega, \qquad r'_{i+1} = r'_i \Delta_i \Phi^*(r'_i)$$

Note that, once again, the first condition imposed on $\Delta_i$ enforces the chain condition whereas
the second condition enforces the safeness of the computation, that is, this condition ensures
that the narrowing operator will not "jump below" the least fixed point, as shown by the
following theorem.

Theorem 3 When the decreasing chain $(r'_i)_{i \in \mathbb{N}}$ is eventually stable, its lower limit is a safe
representation of the least fixed point of $\Phi$.

Proof. We first note that $r'_0 = r_\omega$ being a safe representation of $\phi$, we have:

$$\gamma(r'_0) \sqsupseteq \phi$$

Now, suppose by induction that:

$$\gamma(r'_i) \sqsupseteq \phi$$

then obviously, by monotonicity of $\Phi$:

$$\gamma(\Phi^*(r'_i)) \sqsupseteq \Phi(\gamma(r'_i)) \sqsupseteq \Phi(\phi) = \phi$$

and therefore:

$$\gamma(r'_{i+1}) = \gamma(r'_i \Delta_i \Phi^*(r'_i)) \sqsupseteq \phi$$

which shows that the lower limit $r'_\omega = r'_{i_0}$ for some $i_0 \in \mathbb{N}$ is a safe representation of $\phi$. ■

Note that, contrary to the classical widening/narrowing approach, we do not require that the
meaning of the first element $r'_0 = r_\omega$ be a post fixed point of $\Phi$, which, consequently, avoids
comparing $\gamma(r_i)$ with $\Phi(\gamma(r_i))$ at each stage of the iterative computation of $r_\omega$, as in section
2. This property is very important if, as stated in the introduction, comparing the meaning of
representations is very costly. However, if the elementary widening operator $\nabla_i$ satisfies the
natural stability condition:

$$\forall r, r' \in R : \gamma(r') \sqsubseteq \gamma(r) \implies r \nabla_i r' = r$$

as does the widening operator $\nabla_I$ over the interval lattice $I(Z)$, then $\gamma(r_\omega)$ will always be a post
fixed point of $\Phi$. Elementary widening operators satisfying this condition will be called stable.
Note that complex widening operators, such as the ones that will be presented in the following
sections, will not generally be stable. Intuitively, since we require that two representations
$r$ and $r'$ be comparable only when they are "similar enough", the stability test $\gamma(r') \sqsubseteq \gamma(r)$
will be approximated, and, therefore, redundant information $r'$ added to a representation $r$ will
sometimes lead to a loss of precision.

Finally, note that widening and narrowing are not dual operations. However, for the sake of 
simplicity, we shall only focus on widening operators in the rest of this paper. 




March 1992 



Digital PRL 



Abstract Interpretation by Dynamic Partitioning 






4 Dynamic partitioning 

The aim of this section is to build generic representations based on the idea of "non-redundancy".
We shall first talk about what we call basic partitioning, which is a technique
that can be used to build representations of a concrete complete lattice $(L, \bot, \top, \sqsubseteq, \sqcup, \sqcap)$ using
well chosen subsets of $A$ when $A$ is an upper approximation of $L$. We shall then discuss two
other methods, called basic functional partitioning and functional partitioning, used to build
representations of functions $F : C \to B$ over an infinite set $C$, using subsets of $A \times B$ when
$A$ is an upper approximation of $\mathcal{P}(C)$ and $B$ is a lattice. We start by defining two notions of
non-redundancy for the subsets of a lattice $A$.

Definition 4 A subset $P$ of a lattice $A$ is said to be non-redundant if it does not contain $\bot$
and:

$$\forall a, a' \in P : a \leq a' \implies a = a'$$

$P$ is said to be strongly non-redundant if it does not contain $\bot$ and:

$$\forall a, a' \in P : a \neq a' \implies a \wedge a' = \bot$$

Non-redundant subsets are often called crowns or antichains. The set of non-redundant and
strongly non-redundant subsets of $A$ will be noted respectively $\mathcal{P}_{nr}(A)$ and $\mathcal{P}_{snr}(A)$. Two
elements of a non-redundant subset are either equal or not comparable, whereas they are equal
or have "nothing in common" if they belong to the same strongly non-redundant subset. Note
that strong non-redundancy implies non-redundancy.
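
The two notions can be checked mechanically on a concrete lattice. The Python sketch below takes the interval lattice as the lattice A, with non-empty intervals encoded as `(lo, hi)` pairs so that bottom never appears in a set; this encoding and the helper names are assumptions of the example.

```python
from itertools import combinations


def leq(a, b):
    """Interval inclusion: a is below b."""
    return b[0] <= a[0] and a[1] <= b[1]


def disjoint(a, b):
    """True when the meet of a and b is bottom (no common integer)."""
    return max(a[0], b[0]) > min(a[1], b[1])


def is_non_redundant(p):
    """No two distinct elements of p are comparable (p is an antichain)."""
    return all(not leq(a, b) and not leq(b, a)
               for a, b in combinations(set(p), 2))


def is_strongly_non_redundant(p):
    """Any two distinct elements of p have nothing in common."""
    return all(disjoint(a, b) for a, b in combinations(set(p), 2))


overlapping = {(0, 5), (3, 8)}    # incomparable but overlapping
separated = {(0, 2), (3, 8)}      # pairwise disjoint
nested = {(0, 5), (1, 2)}         # one element below the other
```

The set `overlapping` is non-redundant without being strongly non-redundant, `separated` satisfies both conditions, and `nested` satisfies neither.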

4.1 Basic partitioning 

So let us suppose that $A$ is an upper approximation of a complete lattice $L$ and let $(\alpha, \gamma)$
denote the Galois connection between the two lattices. We wish to build a representation of $L$
using the subsets of $A$. The most natural meaning of a subset $P$ of $A$ is of course:

$$\Gamma(P) = \bigsqcup_{a \in P} \gamma(a)$$

that is, the least upper bound of the set of concrete elements denoted by the abstract elements
of $P$. Let us define the binary relation $\preceq$ over $\mathcal{P}(A)$ by:

$$P \preceq P' \iff \forall a \in P,\ \exists a' \in P' : a \leq a'$$

This relation is similar to the preorder used to build the lower powerdomain (see Gunter and
Scott [8] p. 653), sometimes called the Hoare powerdomain, or the relational powerdomain
(see Schmidt [14], p. 295). The originality of our framework is that using well chosen subsets
of $\mathcal{P}(A)$, we can turn this preorder into a partial order and avoid using principal ideals and
complex power domains, as in Mycroft and Nielson [11] for instance.
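
A small Python sketch may help to see how two different sets can share a meaning while remaining distinct under this ordering. It assumes that the concrete lattice is the powerset of the integers and that the abstract lattice is the interval lattice, with intervals encoded as `(lo, hi)` pairs; the function names `meaning` and `below` are hypothetical.

```python
def leq(a, b):
    """Interval inclusion: a is below b."""
    return b[0] <= a[0] and a[1] <= b[1]


def meaning(p):
    """Union of the sets of integers denoted by the intervals of p."""
    return frozenset(n for lo, hi in p for n in range(lo, hi + 1))


def below(p, q):
    """Ordering on sets: every element of p is below some element of q."""
    return all(any(leq(a, b) for b in q) for a in p)


coarse = {(1, 2)}
fine = {(1, 1), (2, 2)}
```

Both sets denote {1, 2}, yet `fine` is below `coarse` and not the other way round, so an iteration that compares representations with this ordering never needs to decide whether two meanings are equal.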

Theorem 5 $(\mathcal{P}_{nr}(A), \preceq)$ and $(\mathcal{P}_{snr}(A), \preceq)$ are partial orders and $\Gamma$ is monotonic over $\mathcal{P}(A)$.

Proof. $\preceq$ is obviously a preorder. So let $P \preceq P'$, $P' \preceq P$, and $a \in P$. Then there exists
$a' \in P'$ and $a'' \in P$ such that: $a \leq a' \leq a''$, and by non-redundancy: $a = a''$, which
implies that: $a = a' \in P'$, and hence $P \subseteq P'$. But similarly $P' \subseteq P$, and thus $\preceq$ is a partial
order. Finally, it is easy to show that the monotonicity of $\gamma$ implies the monotonicity of $\Gamma$. ■



Under more restrictive conditions, we can show that $(\mathcal{P}_{snr}(A), \preceq)$ is a complete partial order.

Theorem 6 If $A$ is meet-continuous³, then $(\mathcal{P}_{snr}(A), \preceq)$ is a cpo.

Proof. To show that $(\mathcal{P}_{snr}(A), \preceq)$ is complete, let $(P_i)_{i \in \mathbb{N}}$ be an increasing chain. Using the
diagonal argument and the definition of $\preceq$, one can build a (possibly infinite) set of increasing
chains $\{C_j\}_{j \in J}$, $J \subseteq \mathbb{N}$, $C_j = (c_{ji})_{i \in \mathbb{N}}$, such that for all $i \in \mathbb{N}$ : $P_i = \{c_{ji}\}_{j \in J} - \{\bot\}$. But
$A$ being a complete lattice, each increasing chain $C_j$ has a limit $l_j \in A$. The only possible
candidate to the upper limit of the chain $(P_i)_{i \in \mathbb{N}}$ is therefore $\{l_j\}_{j \in J}$. But then for all $j \neq j'$:

$$l_j \wedge l_{j'} = \bigvee_i \left(c_{ji} \wedge \bigvee C_{j'}\right) = \bigvee_{i, i'} \left(c_{ji} \wedge c_{j'i'}\right) = \bot$$

which shows that $\{l_j\}_{j \in J}$ is strongly non-redundant and is the least upper bound of the
chain $(P_i)_{i \in \mathbb{N}}$. ■

In the light of this theorem, one might think that it is a good idea to limit oneself to the
strongly non-redundant subsets of $A$, since they form a complete partial order, and appear
to be "less arbitrary" than the general non-redundant subsets of $A$. But it depends a lot on
the "shape" of the abstract lattice $A$. Intuitively, for strongly non-redundant subsets to be
useful, every abstract element $a \in A$ should be the least upper bound of a set of atoms. This
can be formalized by saying that $A$ should be an algebraic atomic lattice with a strongly
non-redundant basis $\mathcal{A}$ of atoms such that:

$$\forall a \in A : a = \bigvee (\mathcal{A} \cap {\downarrow} a)$$

where ${\downarrow} a = \{a' \in A : a' \leq a\}$ is the principal ideal generated by $a$. Standard examples of
such lattices are $(\mathcal{P}(S), \subseteq)$ and the interval lattice $I(Z)$, with bases $\{\{s\}\}_{s \in S}$ and $\{[i, i]\}_{i \in Z}$
respectively. Counter-examples are complete total orders, for which every subset with more
than two elements is necessarily redundant. More generally, one can prove the following
theorem.

³ A complete lattice $L$ is meet-continuous (see Gierz [7] p. 30) if for every directed subset $D \subseteq L$ and every
element $x \in L$: $x \wedge \bigvee D = \bigvee \{x \wedge d : d \in D\}$
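Theorem 7 Let $(A, \bot, \top, \leq, \vee, \wedge)$ be an upper approximation of $\mathcal{P}(S)$, such that $\gamma$ is one-to-one,
$\gamma(\bot) = \emptyset$, and:

$$[s] \neq [s'] \implies [s] \wedge [s'] = \bot \qquad (\text{where } [s] = \alpha(\{s\}))$$

Then $A$ is an algebraic atomic lattice, $\mathcal{A} = \{[s] : s \in S\}$ is a strongly non-redundant basis
of $A$, and $\{\gamma(a) : a \in \mathcal{A}\}$ is a partition of $S$. Moreover, if $\gamma$ is $\vee$-continuous, then $A$ is
meet-continuous.

Proof. We first note that $\gamma$ being one-to-one, $\alpha \circ \gamma = \mathrm{Id}_A$, $\gamma$ is $\wedge$-continuous and $\alpha$ is
$\cup$-continuous (Cousot [4], theorem 4.2.7.0.3, p. 4.33). Suppose now that there exists $s \in S$
such that $[s] = \bot$. Then $[s] = \alpha(\{s\}) \leq \bot$ and by definition of Galois connections,
$\{s\} \subseteq \gamma(\bot) = \emptyset$ which is impossible. Thus $\mathcal{A}$ is strongly non-redundant. But:

$$x \in \mathcal{A} \cap {\downarrow} a \iff \exists s \in S : x = \alpha(\{s\}) \leq a$$
$$\iff \exists s \in S : x = [s] \wedge \{s\} \subseteq \gamma(a)$$
$$\iff \exists s \in \gamma(a) : x = [s]$$

Therefore $\mathcal{A} \cap {\downarrow} a = \beta(a) = \{[s] : s \in \gamma(a)\}$. In order to show that $a = \bigvee \beta(a)$, we shall
first show that $\gamma(a) = \bigcup_{x \in \beta(a)} \gamma(x)$. $\gamma \circ \alpha \supseteq \mathrm{Id}_{\mathcal{P}(S)}$ and thus:

$$\bigcup_{s \in \gamma(a)} \gamma([s]) \supseteq \bigcup_{s \in \gamma(a)} \{s\} = \gamma(a)$$

Conversely, let $x \in \bigcup_{s \in \gamma(a)} \gamma([s])$. Then there exists $\{s\} \subseteq \gamma(a)$ such that $\{x\} \subseteq \gamma([s])$,
and hence by monotonicity of $\alpha$:

$$\alpha(\{x\}) \leq [s] = \alpha(\{s\}) \leq \alpha(\gamma(a)) = a$$

which implies that $\{x\} \subseteq \gamma(a)$, that is, $x \in \gamma(a)$. Therefore:

$$a = (\alpha \circ \gamma)(a) = \alpha\Big(\bigcup_{x \in \beta(a)} \gamma(x)\Big) = \bigvee_{x \in \beta(a)} (\alpha \circ \gamma)(x) = \bigvee \beta(a)$$

Now, when $\gamma$ is $\vee$-continuous, then for every $a \in A$ and every subset $X \subseteq A$:

$$\gamma(a \wedge \bigvee X) = \gamma(a) \cap \bigcup \{\gamma(x) : x \in X\}$$
$$= \bigcup \{\gamma(a) \cap \gamma(x) : x \in X\}$$
$$= \bigcup \{\gamma(a \wedge x) : x \in X\}$$

and therefore, $\alpha$ being $\cup$-continuous:

$$a \wedge \bigvee X = (\alpha \circ \gamma)(a \wedge \bigvee X)$$
$$= \bigvee \{(\alpha \circ \gamma)(a \wedge x) : x \in X\}$$
$$= \bigvee \{a \wedge x : x \in X\}$$

which shows that $A$ is meet-continuous. Finally, $\gamma([s]) \neq \gamma([s'])$ implies that $[s] \neq [s']$,
and $\gamma$ being $\wedge$-continuous, $\gamma([s]) \cap \gamma([s']) = \gamma([s] \wedge [s']) = \gamma(\bot) = \emptyset$, which proves that
$\{\gamma(a) : a \in \mathcal{A}\}$ is a (set-theoretic) partition of $S$. ■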
What we need now in order to complete our framework is to define elementary widening
operators $\nabla$ such that: $P \preceq P \nabla P'$ and $\Gamma(P') \sqsubseteq \Gamma(P \nabla P')$. There are of course many ways
to define these operators. When working with $\mathcal{P}_{nr}(A)$, the most precise widening operator can
be defined by:

$$P \nabla P' = (P \cup P') - \{a' \in P' : \exists a \in P,\ a' < a\}$$

In fact, this widening operator behaves rather like a join operator. More generally, at each step
of the computation, one can choose (either deterministically or not) a subset $P_0$ of the current
invariant $P$ to coalesce. The idea of such a generalization is to replace $P$ by:

$$(P \cup \{a_0\}) - \{a \in P : a < a_0\}$$

where $a_0$ is any element greater than $\bigvee P_0$. Of course, one might choose to define a very poor
widening, which does not improve the expressible properties of the framework, by:

$$P \nabla P' = \{\bigvee (P \cup P')\}$$

It is easy to see that these definitions turn $\mathcal{P}_{nr}(A)$ into a representation framework abstracting
the lattice $L$ whenever the "generalizations" are properly used. It is difficult to say more about
these generalizations since widening operators are well known to be highly lattice-dependent.
When working with $\mathcal{P}_{snr}(A)$, widening operators are even more difficult to define in the general
case, and we shall only develop an example in section 5.1. Note that in practice, one will
always work with the set of finite strongly non-redundant subsets of $A$ which is generally not
a cpo, so the completeness of $\mathcal{P}_{snr}(A)$ will not help. Finally, note that the definition of the
widening operator has a great influence over the quality of the result of the computation, as
we shall see in section 5.2. Therefore, the general idea one should follow in the definition of a
widening operator $(\nabla_i)_{i \in \mathbb{N}}$ should be to use very precise elementary operators (i.e., join-like)
at the beginning of an iteration sequence, and to generalize only after these operators have
precisely defined the "shape" of the least fixed point. However, as we shall see in section
5.2, there are also cases where it can be a good idea to alternate join-like operators and
generalizations.
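
The three kinds of operators discussed above can be sketched in Python over sets of intervals. This is an illustration under assumptions, not the paper's definition: intervals are `(lo, hi)` pairs, and `widen_precise` keeps the maximal elements of the union, which is a slight variation that also drops elements of the first argument dominated by elements of the second so that the result stays non-redundant. The function names are hypothetical.

```python
def leq(a, b):
    """Interval inclusion: a is below b."""
    return b[0] <= a[0] and a[1] <= b[1]


def join(a, b):
    """Least upper bound of two non-empty intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def maximal(elements):
    """Elements that are not strictly below another element of the set."""
    pool = set(elements)
    return frozenset(a for a in pool
                     if not any(a != b and leq(a, b) for b in pool))


def widen_precise(p, q):
    """Join-like widening: maximal elements of the union (assumed variation)."""
    return maximal(set(p) | set(q))


def coalesce(p, p0):
    """Replace the chosen non-empty subset p0 of p by its join a0."""
    chosen = list(p0)
    a0 = chosen[0]
    for a in chosen[1:]:
        a0 = join(a0, a)
    # a0 is kept; every element of p below a0 is dropped.
    return frozenset({a0}) | frozenset(a for a in p if not leq(a, a0))


def widen_poor(p, q):
    """Very poor widening: a single element covering everything."""
    both = set(p) | set(q)
    return coalesce(both, both)
```

For instance, coalescing the subset {[0, 1], [3, 4]} of {[0, 1], [3, 4], [10, 12]} yields {[0, 4], [10, 12]}, which trades some precision for a smaller representation while leaving the unrelated element untouched.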

4.2 Basic functional partitioning 

A central problem in abstract interpretation is to find a safe approximation of a least fixed
point $F$ that belongs to a functional lattice $C \to B$, where $(B, \bot, \top, \sqsubseteq, \sqcup, \sqcap)$ is itself a lattice.
For instance, for very simple, non-recursive programming languages, $C$ is usually the finite
set of lexical control points, and $B$ the powerset of run-time memory states. But for more
complicated, recursive languages, a control point is more naturally defined as a subpart of the
run-time stack, and $C$ is infinite. More generally, there are cases where it can be interesting
to consider that control points are indeed execution traces and not only static control points.
Finally, in the minimal function graph approach, $C$ is the set of admissible inputs of program
functions, and $B$ is the lattice of possible outputs, the bottom value of $B$ being used to denote
nontermination. In this paper, we shall refer to the elements of the possibly infinite set $C$ as
control points.
Very often however, there is no need to know the value of $F$ for every control point, and
it is sufficient to determine a safe approximation $F_i \sqsupseteq \bigsqcup \{F(c) : c \in C_i\}$ of the values taken
by $F$ over each subset of a given partition $\{C_i\}_{i \in I}$ of $C$. This partition can be defined in a
very natural way by assigning a token to each control point. This method has been proposed
for instance in Sharir and Pnueli [13] and Jones [9], where tokens are used to group execution
traces and coalesce the memory states associated with them.

If the set of tokens is finite, then the framework is said to be partitioned (see Cousot [6] p. 
315) and the problem is equivalent to the resolution of a finite system of semantic equations. 
This is the case for instance in Bourdoncle [1], where a token is assigned to each run-time 
stack. These tokens model the "shape" of the stack (pointers, control stack, ...) and generalize
the tokens used in Sharir and Pnueli [13] that only took into account the control part of the 
run-time stacks. 

However, if the set of tokens is infinite (or very large) and one has no idea of a good way of 
defining a finite partition, then the original problem of finding a safe and finitely represented 
approximation of F remains to be solved. The idea is then to "lift" F so that it operates on 
sets of control points, and to dynamically calculate a partition of C, instead of it being "hard 
wired". 

We are going to study two general methods for doing this dynamic partitioning. For each 
of these methods, we suppose that there is a basic (and supposedly not satisfactory) way 
of finitely representing sets of control points, and we intend to build a representation from 
this initial approximation. We shall therefore suppose that $(A, \bot, \top, \le, \vee, \wedge)$ is an upper
approximation of $(\wp(C), \emptyset, C, \subseteq, \cup, \cap)$, and call $(\alpha, \gamma)$ the Galois connection between the two
lattices. We shall also suppose that $\gamma$ is one-to-one and that $\gamma(\bot) = \emptyset$, which implies that $\gamma$
is $\wedge$-continuous. The elements of $A$ will be called abstract control points and the elements
of $T = \{[c] : c \in C\}$, where $[c] = \alpha(\{c\})$, will be called the tokens. Theorem 7 shows that
whenever $T$ is strongly non-redundant, then $A$ is an algebraic atomic lattice, and $T$ defines a
partition of $C$, but this hypothesis will not be necessary. Our definition therefore generalizes
the classical notion of token. Finally, the elements of $B$ will be called abstract values.

The first representation that we shall define is based on the very naive observation that every
abstract control point $a \in A$ implicitly defines a (possibly infinite) set of control points $\gamma(a)$
that we shall informally call a "region". Therefore, an easy way to approximate a function
from $C$ into $B$ is to "cover" the region over which this function is different from $\bot$ by a finite
subset of $A$, and to associate an abstract value $b$ with each element $a$ of this subset.

If the regions of such a representation $P \subseteq A \times B$ do not overlap, the natural meaning of
$P$ will map every control point $c$ to the unique abstract value $b$ associated with the element $a$
by which its token $[c]$ is covered, or to $\bot$ if its token is not covered. However, if the regions
overlap, the meaning of $P$ can be defined in several ways. We shall study in this section the
most natural idea which is to map every control point $c$ to the union of the abstract values
associated with the abstract control points $a$ with which its token intersects. We will show that
these representations can be constrained in order to form a partial order compatible with this 
meaning, and then explain how widening operators can be effectively designed. We shall then
study, in the next section, a different and non-standard meaning of overlapping representations
that is better suited to generalization.

Definition 8 A subset $P$ of $A \times B$ is said to be normalized if:

$$\forall (a,b) \in P, \; \forall (a',b') \in P : \; a = a' \implies b = b'$$

For every normalized subset $P$, and $a, a_1, a_2 \in A$, we define:

$$A(P) = \{a \in A : \exists b \in B : (a,b) \in P\}$$

$$C(P) = \bigcup \{\gamma(a) : (a,b) \in P\}$$

$$P(a) = \bigsqcup \{b \in B : (a,b) \in P\}$$

$$P[a_1,a_2] = \{b : (a,b) \in P \wedge a_1 \le a \le a_2\}$$

The set $A(P)$ is called the domain of $P$, and the abstract value $P(a)$ is the image of $a$ by $P$.
The set $C(P)$ is the concrete domain of $P$, i.e., the "region" of $C$ covered by the domain of
$P$. For a normalized subset $P$ of $A \times B$, the image $P(a)$ of every element of the domain of
$P$ is the unique element $b$ such that $(a,b) \in P$. We call $P(A,B)$ the set of normalized subsets
$P$ whose domains do not contain $\bot$, and $P_{nr}(A,B)$ (resp. $P_{snr}(A,B)$) the set of normalized
subsets which have a non-redundant (resp. strongly non-redundant) domain. Obviously:

$$P_{snr}(A,B) \subseteq P_{nr}(A,B) \subseteq P(A,B)$$

We then define the meaning $\Gamma(P)$ of a representation $P \in P(A,B)$ by:

$$\Gamma(P)(c) = M_P[c]$$

where the monotonic function $M_P : A \to B$ is defined by:

$$M_P(x) = \bigsqcup_{a \in A(P),\; a \wedge x \ne \bot} P(a)$$

When $T$ is a strongly non-redundant basis of $A$, we obviously have:

$$\forall a \in A : \; [c] \wedge a \ne \bot \iff [c] \le a$$

and therefore:

$$\Gamma(P)(c) = \bigsqcup \{b : (a,b) \in P \wedge [c] \le a\}$$

which states that each control point $c$ is mapped to the union of the abstract values $b$ attached
to the elements of $A(P)$ by which its token $[c]$ is "covered", or to $\bot$ if its token is not covered.
Note that if $A(P)$ is strongly non-redundant, then an element of the basis is at most covered by
a single element in $A(P)$. It is worth mentioning at this stage that although any set of tokens
can be chosen, it seems reasonable to impose that $T$ be strongly non-redundant. To see the
problem, let us choose $C = \mathbb{Z}$, $A = \mathbb{Z}_\bot$, and $[c] = c$. Then $\gamma(c) = \{\ldots, c\}$, the lattice
$(A, \bot, \omega^+, \le, \max, \min)$ is totally ordered, and:

$$\forall a, a' \in A : \; a \ne \bot \wedge a' \ne \bot \iff a \wedge a' \ne \bot$$

which shows that $\Gamma(P)$ is constant over $C$ and:

$$\Gamma(P)(c) = \bigsqcup \{b : (a,b) \in P\}$$

Intuitively, all the "regions" defined by the abstract control points overlap, and therefore, there
is no way to distinguish between the different abstract values $b \in B$ of the representation. We
are now going to show that $P_{nr}(A,B)$ and $P_{snr}(A,B)$ can be turned into partial orders.
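
As an illustration of the meaning $M_P$ of basic functional partitioning, the following minimal Python sketch instantiates both $A$ and $B$ with integer intervals. This instantiation, the encoding of $\bot$ as `None`, the sample representation and the names `meet`, `join` and `basic_meaning` are assumptions made for the example only.

```python
BOT = None  # bottom element: the empty interval (assumed encoding)

def meet(x, y):
    """Greatest lower bound of two intervals (their intersection)."""
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def join(x, y):
    """Least upper bound of two intervals (their convex hull)."""
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def basic_meaning(P, c):
    """Gamma(P)(c): join of the values b whose abstract control point a
    overlaps the token [c] = [c, c]."""
    token = (c, c)
    out = BOT
    for a, b in P:
        if meet(a, token) is not BOT:
            out = join(out, b)
    return out

# Hypothetical representation with two overlapping regions.
P = [((0, 10), (1, 1)), ((5, 20), (7, 9))]
print(basic_meaning(P, 3))   # covered by [0, 10] only
print(basic_meaning(P, 7))   # covered by both regions: the values are joined
print(basic_meaning(P, 30))  # uncovered token: bottom
```

Inside the overlap the two abstract values are merged, which is the loss of information that the totally ordered example exhibits in its extreme form.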

Theorem 9 $P_{nr}(A,B)$ and $P_{snr}(A,B)$ are partial orders for the binary relation:

$$P \preceq P' \iff \forall (a,b) \in P, \; \exists (a',b') \in P' : \; a \le a' \wedge b \sqsubseteq b'$$

and the meaning function $\Gamma$ is monotonic over $P_{nr}(A,B)$ and $P_{snr}(A,B)$.

The proof is straightforward. Note that every function $F$ in $C \to B$ can always be finitely
abstracted by $\{(\top,\top)\}$ and, when $T$ is non-redundant, it can also be safely abstracted by
$\{([c], F(c)) : c \in C\}$. Therefore, defining a widening operator over $P_{nr}(A,B)$ will turn
$P_{nr}(A,B)$ into a representation of $C \to B$. Elementary widening operators can be defined as
follows:

• We first define the domain of $P \nabla P'$ by:

$$A(P \nabla P') = A(P) \nabla_b A(P')$$

where $\nabla_b$ is any basic partitioning widening operator defined in section 4.1.

• Then, for every $a$ in the domain of $P \nabla P'$, we define the image of $a$ by:

$$(P \nabla P')(a) = b$$

where $b \in B$ is any abstract value such that:

$$b \sqsupseteq M_P(a) \sqcup M_{P'}(a)$$

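To prove that $P \preceq P \nabla P'$, we remark that by definition $A(P) \preceq A(P \nabla P')$, and therefore:

$$\forall (a,b) \in P, \; \exists a' \in A(P \nabla P') : \; a \le a'$$

hence:

$$b = P(a) \sqsubseteq M_P(a) \sqsubseteq M_P(a') \sqsubseteq M_P(a') \sqcup M_{P'}(a') \sqsubseteq (P \nabla P')(a')$$

and thus:

$$\exists (a',b') \in P \nabla P' : \; a \le a' \wedge b \sqsubseteq b'$$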

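Let us now prove that $\Gamma(P') \sqsubseteq \Gamma(P \nabla P')$. So let $(a',b') \in P'$ and $x \in A$ be such that
$x \wedge a' \ne \bot$. By construction, we have:

$$C(P') = \bigcup_{a \in A(P')} \gamma(a) \subseteq \bigcup_{a \in A(P \nabla P')} \gamma(a) = C(P \nabla P')$$

and therefore, there exists a pair $(a,b) \in P \nabla P'$ such that $a \wedge x \wedge a' \ne \bot$, otherwise, $\gamma$ being
$\wedge$-continuous:

$$\begin{aligned}
\gamma(x \wedge a') &= \gamma(x \wedge a') \cap \gamma(a') \\
&\subseteq \gamma(x \wedge a') \cap \bigcup_{a \in A(P')} \gamma(a) \\
&\subseteq \bigcup_{a \in A(P \nabla P')} \bigl(\gamma(x \wedge a') \cap \gamma(a)\bigr) \\
&= \bigcup_{a \in A(P \nabla P')} \gamma(a \wedge x \wedge a') = \emptyset
\end{aligned}$$

and thus, $\gamma$ being one-to-one, $x \wedge a' = \bot$, which is absurd. Consequently, $a \wedge a' \ne \bot$, and
therefore:

$$b \sqsupseteq M_P(a) \sqcup M_{P'}(a) \sqsupseteq M_{P'}(a \wedge a') \sqsupseteq b'$$

which shows that:

$$M_{P'}(x) = \bigsqcup_{(a',b') \in P',\; a' \wedge x \ne \bot} b' \;\sqsubseteq\; \bigsqcup_{(a,b) \in P \nabla P',\; a \wedge x \ne \bot} b = M_{P \nabla P'}(x)$$

and hence:

$$\Gamma(P') \sqsubseteq \Gamma(P \nabla P')$$

Provided that condition (v) of definition 1 is satisfied, $(P_{nr}(A,B), \preceq, \Gamma, \nabla)$ is thus a representation
of the functional lattice $C \to B$.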

4.3 Functional partitioning 

The main interest of basic functional partitioning is that it is indeed very natural and easy 
to understand: the value mapped to a control point c by the meaning of a representation P 
is defined as the union of the abstract values b associated with the abstract control points a 
which have "something in common" with its token $[c]$, i.e., $[c] \wedge a \ne \bot$. But basic functional
partitioning has several shortcomings. 

Firstly, as we noted earlier, basic functional partitioning is reasonably applicable only when 
the lattice A is an algebraic atomic lattice with a strongly non-redundant basis. This can be very 
annoying when approximating higher order functions for instance, since abstract functional 
lattices do not generally have a strongly non-redundant basis. 

Secondly, there are cases where the ordering $\preceq$ over $P_{nr}(A,B)$ is not appropriate, and
one would like abstract control points of representations to be maintained during iterative 
computations. This happens to be the case for interprocedural abstract interpretation since 
abstract control points naturally correspond to function calls in the fixed point computation 
algorithm, and the abstract call graph, which is generally needed to determine the set of 
recursive procedures, is therefore defined in terms of abstract control points. Of course, this 
goal can be easily achieved by slightly modifying the definition of $\preceq$ as follows:

$$\forall (a,b) \in P, \; \exists (a',b') \in P' : \; a = a' \wedge b \sqsubseteq b'$$

Figure 1: Basic functional partitioning vs. functional partitioning.

However, this ordering has a major drawback with respect to the definition of widening
operators in that, intuitively, the only way to generalize a representation $P$ without losing too
much information is to "pave" $C$ using strongly non-redundant subsets of $A$, because every
extra element $(a',b')$ added to $P$ leads to a loss of information for every token "covered" by
$a'$. This is illustrated by the left part of figure 1 in the case of the interval lattice $I(\mathbb{Z}^2)$. But this
pavement can turn out to be much too complicated when working with sophisticated lattices
such as the linear inequalities lattice for instance. Worse, it can even be impossible to define
such a pavement for atomic lattices that do not have a strongly non-redundant basis. What we
would like to do is therefore to map small regions $\{a_i\}_{i \in I}$ of $C$ to a given set of values $\{b_i\}_{i \in I}$,
while defining some kind of default value $b'$ for a larger and possibly overlapping area $a'$, as
illustrated by the right part of figure 1. This can be achieved by generalizing the definition of
the meaning function $\Gamma$ to every element $P$ of $P(A,B)$ by:

$$\Gamma(P)(c) = M_P[c]$$

where $M_P : A \to B$ is defined by:

$$M_P(x) = \bigsqcup_{a \in A(P),\; a \wedge x \ne \bot} V_P(a \wedge x, a)$$

and $V_P(u,v) = \bigsqcap P[u,v]$

Given two abstract control points $u$ and $v$, $V_P(u,v)$ is equal to the greatest lower bound of
the abstract values $b$ associated with the abstract control points $a$ that belong to the (possibly
empty) convex subset $\{x : u \le x \le v\}$. It is easy to see that this function is increasing
in its first argument and decreasing in its second argument. The function $M_P$ is therefore
monotonic and maps every abstract control point $x \in A$ to the union of the abstract values
$b$ associated with the minimum elements $a \in A(P)$ such that $x \wedge a \ne \bot$. The meaning of
a representation in $P_{nr}(A,B)$ is thus identical to the meaning defined in the previous section,
since every element of a non-redundant domain $A(P)$ is minimum. The meaning of $M_P$
for the representation $P = \{(a,b), (a',b'), (a'',b'')\}$ and three particular values $a = [2,13]$,
$a' = [10,18]$ and $a'' = [6,21]$, is illustrated in figure 2 in the case of the interval lattice $I(\mathbb{Z})$.
The function $M_P$ maps an abstract control point, represented by a point on the plane using
the usual encoding of intervals:

$$[x,y] \mapsto ((x+y)/2, \; (y-x)/2)$$

to an abstract value in $B$. The function $\Gamma(P) : \mathbb{Z} \to B$ maps every integer $k$ to $M_P[k]$,
where the token $[k] = [k,k]$ is the one-point interval. Note how the intervals below $a'$ are
"protected" from the value $b''$ associated with the abstract control point $a''$.

Figure 2: Meaning of $M_P$ for $P = \{(a,b), (a',b'), (a'',b'')\}$.

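The "protection" effect can be reproduced with a small Python sketch of $V_P$ and $M_P$ over the interval lattice, using the three abstract control points $[2,13]$, $[10,18]$ and $[6,21]$ of the text. The abstract values attached to them, the encoding of $\bot$ as `None`, and all function names are assumptions made for the illustration.

```python
import math
from functools import reduce

BOT = None                    # bottom: the empty interval (assumed encoding)
TOP = (-math.inf, math.inf)   # top: all of Z

def leq(x, y):
    """Interval inclusion, i.e. the order of the lattice."""
    return x is BOT or (y is not BOT and y[0] <= x[0] and x[1] <= y[1])

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def V(P, u, v):
    """V_P(u, v): meet of the values b of the pairs (a, b) with u <= a <= v."""
    return reduce(meet, (b for a, b in P if leq(u, a) and leq(a, v)), TOP)

def M(P, x):
    """M_P(x): join of V_P(a ^ x, a) over the points a that overlap x."""
    out = BOT
    for a, _ in P:
        ax = meet(a, x)
        if ax is not BOT:
            out = join(out, V(P, ax, a))
    return out

def meaning(P, k):
    """Gamma(P)(k) = M_P([k, k])."""
    return M(P, (k, k))

# a, a', a'' as in the text; the three values b, b', b'' are hypothetical.
b, b1, b2 = (0, 0), (1, 1), (0, 5)
P = [((2, 13), b), ((10, 18), b1), ((6, 21), b2)]
for k in (7, 11, 15, 20, 30):
    print(k, meaning(P, k))
```

With these values, the integer 15 lies below $a'$ and is mapped to $b'$ alone, whereas 20 lies only below $a''$ and receives $b''$.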
The problem however with this new meaning function $\Gamma$ is that it is not monotonic
over $P(A,B)$. Intuitively, non-monotonicity arises when an element $(a',b')$ is added to a
representation $P$ and $a'$ "masks" a region previously mapped to a value greater than $b'$. This
would be the case for instance in figure 2 if $(a',b')$ was added to $\{(a,b), (a'',b'')\}$ and $b' \sqsubset b''$.
In order to avoid this situation, one can require that $b'$ be greater than $S_P(a')$, where the
smallest safe value $S_P(x)$ is defined by:

$$S_P(x) = \bigsqcup_{a \in A(P),\; \bot < x \le a} V_P(x, a)$$

Intuitively, this condition ensures that the new value $b'$ is at least the union of every value that
$a'$ could mask, i.e., the values associated with the minimum elements in $A(P)$ that are above
$a'$. This intuition can be formalized by defining the relation $\preceq$ over $P(A,B)$ as follows:

$$P \preceq P' \iff \begin{cases} \forall (a,b) \in P, \; \exists (a',b') \in P' : \; a = a' \wedge b \sqsubseteq b' \\ \forall (a',b') \in P' : \; a' \notin A(P) \implies b' \sqsupseteq S_P(a') \end{cases}$$

Note that since $S_P(a) = P(a)$ for every $a$ in the domain of $P$, this condition could also be
written as follows:

$$P \preceq P' \iff \begin{cases} A(P) \subseteq A(P') \\ \forall (a',b') \in P' : \; b' \sqsupseteq S_P(a') \end{cases}$$
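
The safety criterion can be checked mechanically. The sketch below is a minimal Python rendering over integer intervals of the smallest safe value $S_P$ and of the second formulation of $\preceq$; the interval instantiation, the sample values and the names `S` and `precedes` are assumptions of the example.

```python
import math
from functools import reduce

BOT = None                    # bottom: the empty interval (assumed encoding)
TOP = (-math.inf, math.inf)

def leq(x, y):
    return x is BOT or (y is not BOT and y[0] <= x[0] and x[1] <= y[1])

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def V(P, u, v):
    """V_P(u, v): meet of the values whose point a satisfies u <= a <= v."""
    return reduce(meet, (b for a, b in P if leq(u, a) and leq(a, v)), TOP)

def M(P, x):
    out = BOT
    for a, _ in P:
        ax = meet(a, x)
        if ax is not BOT:
            out = join(out, V(P, ax, a))
    return out

def S(P, x):
    """S_P(x): join of V_P(x, a) over the points a of P with bottom < x <= a."""
    out = BOT
    if x is not BOT:
        for a, _ in P:
            if leq(x, a):
                out = join(out, V(P, x, a))
    return out

def precedes(P, Q):
    """P <= Q: the domain grows and every value of Q dominates S_P at its point."""
    dom_p, dom_q = {a for a, _ in P}, {a for a, _ in Q}
    return dom_p <= dom_q and all(leq(S(P, a), b) for a, b in Q)

# Figure 2 without a'; the values (0, 0), (0, 5), (1, 1) are hypothetical.
P = [((2, 13), (0, 0)), ((6, 21), (0, 5))]
unsafe = P + [((10, 18), (1, 1))]   # b' strictly below b'': masks [6, 21]
safe = P + [((10, 18), (0, 5))]     # b' above S_P(a')
print(S(P, (10, 18)), precedes(P, unsafe), precedes(P, safe))
print(M(P, (15, 15)), M(unsafe, (15, 15)), M(safe, (15, 15)))
```

The unsafe insertion lowers the meaning at 15, which is the non-monotonic behaviour the condition $b' \sqsupseteq S_P(a')$ rules out.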



Theorem 10 $(P(A,B), \preceq)$ is a partial order, and $\Gamma$ is monotonic over $P(A,B)$.

Proof. Let us first show that $\Gamma$ is monotonic over $P(A,B)$. So let us suppose that $P \preceq P'$.
Then $A(P) \subseteq A(P')$, and for every $x \in A$ such that $x \ne \bot$:

$$\begin{cases} S_{P'}(x) \sqsupseteq \bigsqcup_{a \in A(P),\; \bot < x \le a} V_{P'}(a \wedge x, a) \\ M_{P'}(x) \sqsupseteq \bigsqcup_{a \in A(P),\; a \wedge x \ne \bot} V_{P'}(a \wedge x, a) \end{cases}$$

But for every $a \in A(P)$ such that either $x \wedge a \ne \bot$ or $\bot < x \le a$:

$$V_{P'}(a \wedge x, a) = \bigsqcap_{(a',b') \in P',\; a \wedge x \le a' \le a} b'$$

Let us now suppose that there exists $(a',b') \in P'$ such that $a \wedge x \le a' \le a$. Then by
hypothesis:

$$b' \sqsupseteq S_P(a') = \bigsqcup_{a'' \in A(P),\; a' \le a''} V_P(a',a'') \;\sqsupseteq\; \bigsqcup_{a'' \in A(P),\; a' \le a'' \le a} V_P(a',a'')$$

But $a \wedge x \le a' \le a'' \le a$ implies that $V_P(a \wedge x, a) \sqsubseteq V_P(a', a'')$, and since $a \in A(P)$, the
set $\{a'' \in A(P) : a' \le a'' \le a\}$ is non-empty and thus:

$$b' \sqsupseteq V_P(a \wedge x, a)$$

which implies that:

$$V_{P'}(a \wedge x, a) \sqsupseteq V_P(a \wedge x, a)$$

and therefore:

$$\begin{cases} S_{P'}(x) \sqsupseteq S_P(x) \\ M_{P'}(x) \sqsupseteq M_P(x) \end{cases}$$

Consequently, since $S_Q(\bot) = M_Q(\bot) = \bot$ for every representation $Q$, then $M_P \sqsubseteq M_{P'}$,
$S_P \sqsubseteq S_{P'}$, and $\Gamma$ is monotonic over $P(A,B)$. Finally, $\preceq$ being trivially reflexive and
antisymmetric, let us prove that it is also transitive. So let $P \preceq P' \preceq P''$. Then for every
$(a'',b'') \in P''$ such that $a'' \notin A(P)$, either $a'' \in A(P')$, and thus $b'' \sqsupseteq b' \sqsupseteq S_P(a'')$, or else:

$$b'' \sqsupseteq S_{P'}(a'') \sqsupseteq S_P(a'')$$

which proves that $P \preceq P''$. ∎

We have proven that $(P(A,B), \preceq)$ is a partial order, but we must note that the meaning function
is not strictly monotonic, i.e., one can find two distinct representations such that $P_1 \prec P_2$ and
$\Gamma(P_1) = \Gamma(P_2)$. This holds for instance whenever $b_1 \sqsubset b_2$ for:

$$P_i = \{([1,1], b), ([2,3], b'), ([1,3], b_i)\} \quad (i \in \{1,2\})$$

since:

$$\Gamma(P_1) = \Gamma(P_2) = \{1 \mapsto b, \; 2 \mapsto b', \; 3 \mapsto b'\}$$

This problem can be solved by considering well chosen subsets of $P(A,B)$, but we shall not
study this problem here. We have thus defined a very flexible framework such that abstract
control points are added and never removed during a given computation. Moreover, the
information associated with each control point is only allowed to increase. Finally, we have
a very easy criterion to check whether or not a pair $(a,b) \in A \times B$ can be safely added to a
representation, which is very important if one wishes to be able to generalize at some point of
the computation. What we need now is to define elementary widening operators, i.e., operators
such that: $P \preceq P \nabla P'$ and $\Gamma(P') \sqsubseteq \Gamma(P \nabla P')$.

Figure 3: Widening in a functional partitioning framework.

In the rest of this section, we shall only consider finite representations, for they have the
greatest practical interest, and for the sake of simplicity, we start by defining $P'' = P \nabla P'$ for
a singleton $P' = \{(a',b')\}$. There are basically three cases in the definition of $P''$. Each one is
illustrated in figure 3, where we have taken $A = I(\mathbb{Z})$, and $P = \{(a_1,b_1), (a_2,b_2), (a_3,b_3)\}$.

• If $a' \in A(P)$, then for every $(a,b) \in P$ such that $a \le a'$, the replacement of $b$ by any
element greater than $b' \sqcup b$ ensures that the meaning of $P''$ is greater than the meaning
of $\{(a',b')\}$, and at the same time that $P \preceq P''$ (fig. 3a).

• If $a' \notin A(P)$, suppose one wishes to add this new abstract control point to the domain
of the current representation $P$. Obviously $(a',b')$ cannot simply be added to $P$. But
adding any pair $(a'',b'')$ such that $a'' \ge a'$ and $b'' \sqsupseteq b' \sqcup S_P(a'')$ will ensure that $P \preceq P''$.
However, as in the previous case, every $(a,b) \in P$ such that $a \le a''$ may "mask" the
value $b'$ to several elements of the basis. Hence, each value $b$ must be replaced by an
element greater than $b' \sqcup b$ to ensure that the meaning of $P''$ is greater than the meaning
of $\{(a',b')\}$ (fig. 3b and 3c).

• There are cases however where $a' \notin A(P)$ but one does not want to add $a'$ to the
domain of $P$. In fact, this will almost always be the case when working with finite
representations. There are then two subcases to consider. If $\gamma(a') \subseteq C(P)$, every
control point $c$ represented by $a'$ is already represented by at least one element $a$ such
that $(a,b) \in P$ and $[c] \le a$, and replacing $b$ by $b \sqcup b'$ will ensure that the meaning of $P''$
is greater than the meaning of $\{(a',b')\}$, provided of course that, as in the previous
cases, every $b''$ such that $(a'',b'') \in P$ and $a'' \le a$ be replaced by an element greater
than $b' \sqcup b''$ (fig. 3d). But if $\gamma(a') \not\subseteq C(P)$, then the region defined by $a'$ contains "new"
control points, and there is no way to avoid the addition of $a'$ to $A(P)$. The previous
case must thus be applied, choosing for instance $a'' = \top$.

Note that, in practice, the test $\gamma(a') \subseteq C(P)$ will always be approximated, and a given
abstract control point $a'$ will not be added to the domain of $P$ only if there exists
$(a,b) \in P$ such that $a' \le a$. Such an approximation will thus generally imply the
non-stability of the widening operator (cf. section 3).
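
The cases above can be sketched as a single Python function over integer intervals. This is only one possible reading under stated assumptions: the function name `widen_singleton`, its optional argument `a2` (the point $a''$ actually inserted, defaulting to $a'$), the choice of the least admissible values, and the sample data are all hypothetical.

```python
import math
from functools import reduce

BOT = None                    # bottom: the empty interval (assumed encoding)
TOP = (-math.inf, math.inf)

def leq(x, y):
    return x is BOT or (y is not BOT and y[0] <= x[0] and x[1] <= y[1])

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def V(P, u, v):
    return reduce(meet, (b for a, b in P if leq(u, a) and leq(a, v)), TOP)

def M(P, x):
    out = BOT
    for a, _ in P:
        ax = meet(a, x)
        if ax is not BOT:
            out = join(out, V(P, ax, a))
    return out

def S(P, x):
    out = BOT
    if x is not BOT:
        for a, _ in P:
            if leq(x, a):
                out = join(out, V(P, x, a))
    return out

def precedes(P, Q):
    dom_p, dom_q = {a for a, _ in P}, {a for a, _ in Q}
    return dom_p <= dom_q and all(leq(S(P, a), b) for a, b in Q)

def widen_singleton(P, a1, b1, a2=None):
    """P widened by {(a1, b1)}; a2 >= a1 is the point used when a1 is new."""
    dom = {a for a, _ in P}
    target = a1 if (a1 in dom or a2 is None) else a2
    assert leq(a1, target)
    # Raise every value attached to a point below the target so that none masks b1.
    Q = [(a, join(b, b1)) if leq(a, target) else (a, b) for a, b in P]
    if target not in dom:
        Q.append((target, join(b1, S(P, target))))   # least safe value
    return Q

P = [((2, 13), (0, 0)), ((6, 21), (0, 5))]
Q = widen_singleton(P, (10, 18), (1, 1))    # new point, added as is
R = widen_singleton(Q, (6, 21), (9, 9))     # point already in the domain
print(Q)
print(R)
```

On this data both results satisfy the two requirements of an elementary widening operator, which can be confirmed by evaluating `precedes` and comparing meanings token by token.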

This definition shows that functional partitioning is well suited to generalization processes, for
it enables one to easily generalize without losing too much information. In order to complete
our framework, we now define $P \nabla Q$ for any finite representation $Q \in P(A,B)$ by arbitrarily
numbering the elements $(a_i, b_i)$, $i \in [1,k]$ of $Q$, and adding them one at a time to $P$, i.e.:

$$P \nabla Q = ((P \nabla Q_1) \cdots) \nabla Q_k$$

where $Q_i = \{(a_i, b_i)\}$. This definition trivially implies that:

$$P \nabla Q \succeq ((P \nabla Q_1) \cdots) \nabla Q_{k-1} \succeq \cdots \succeq P$$

and thanks to the next theorem:

$$\begin{aligned}
\Gamma(P \nabla Q) &= \Gamma(((P \nabla Q_1) \cdots) \nabla Q_k) \\
&\sqsupseteq \Gamma(Q_1) \sqcup \cdots \sqcup \Gamma(Q_k) \\
&\sqsupseteq \Gamma(Q_1 \cup \cdots \cup Q_k) \\
&= \Gamma(Q)
\end{aligned}$$

which shows that condition (iv) of definition 1 is satisfied.

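Theorem 11 For every $Q_1, Q_2 \in P(A,B)$ such that $Q_1 \cup Q_2 \in P(A,B)$:

$$\Gamma(Q_1 \cup Q_2) \sqsubseteq \Gamma(Q_1) \sqcup \Gamma(Q_2)$$

Proof. We first remark that for every $u, v \in A$:

$$(Q_1 \cup Q_2)[u,v] = Q_1[u,v] \cup Q_2[u,v]$$

thus:

$$V_{Q_1 \cup Q_2}(u,v) = V_{Q_1}(u,v) \sqcap V_{Q_2}(u,v) \sqsubseteq V_{Q_1}(u,v), \; V_{Q_2}(u,v)$$

and therefore:

$$\begin{aligned}
M_{Q_1 \cup Q_2}(x) &= \bigsqcup_{a \in A(Q_1 \cup Q_2),\; a \wedge x \ne \bot} V_{Q_1 \cup Q_2}(a \wedge x, a) \\
&= \bigsqcup_{a \in A(Q_1),\; a \wedge x \ne \bot} V_{Q_1 \cup Q_2}(a \wedge x, a) \;\sqcup\; \bigsqcup_{a \in A(Q_2),\; a \wedge x \ne \bot} V_{Q_1 \cup Q_2}(a \wedge x, a) \\
&\sqsubseteq M_{Q_1}(x) \sqcup M_{Q_2}(x)
\end{aligned}$$

which proves that $\Gamma(Q_1 \cup Q_2) \sqsubseteq \Gamma(Q_1) \sqcup \Gamma(Q_2)$. ∎

Figure 4: Widenings over $P_{nr}(I(\mathbb{Z}^2))$.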



5 Applications 

We are now going to present two possible applications of dynamic partitioning. The first one 
is simply the application of basic partitioning to the bi-dimensional interval lattice $I(\mathbb{Z}^2)$. Its
interest is rather academic, but we shall use this example to illustrate how widening operators 
can be effectively built. In the second example, we will show how a precise description of the 
input/output behavior of a program function can be computed using functional partitioning. 
This example will exemplify the case where the shape of a program invariant cannot be 
predicted and has to be considered as an output of the fixed point computation itself. 

5.1 Multi-intervals

The aim of this section is to show how sets of bi-dimensional intervals can be used to
represent sets of integer pairs. Following the method developed in section 4.1, we can either
use non-redundant subsets or strongly non-redundant subsets of $I(\mathbb{Z}^2)$. Note that strongly
non-redundant subsets of $I(\mathbb{Z}^2)$ are always larger than non-redundant subsets. We are going to
illustrate the ideas that can be used to build elementary widening operators over such subsets.
Figure 4a shows an element $P = \{a_1, a_2, a_3\}$ of $P_{snr}(I(\mathbb{Z}^2))$ plus an extra element $a' \in I(\mathbb{Z}^2)$.
We wish to calculate $P' = P \nabla \{a'\}$. Figure 4b illustrates how $P'$ can be defined using a
join-like operator. Note that such an operator might be very difficult to implement. So let
us focus on $P_{nr}(I(\mathbb{Z}^2))$. We shall define two elementary operators. The first one $\nabla_j$ (fig. 4c)
behaves like a join operator and shall be used in the first steps of a computation. The second
one $\nabla_g$ (fig. 4d) computes a generalization as follows. Using the widening operator $\nabla_I$ over
$I(\mathbb{Z}^2)$ defined in section 2, one first computes $a'' = (\bigvee P) \nabla_I a'$. Intuitively, $\bigvee P$ is used as a
reference to determine in "which direction" $a'$ is "moving". Of course, different references can
be used, such as the most recently added elements of $P$ for instance. Finally, $P'$ is calculated
by removing the redundant elements of $P \cup \{a''\}$.

Figure 5: Safe representation of lfp($\Phi$).

We shall use these two elementary widening
operators to compute a non-trivial, safe and finitely represented approximation of the least
fixed point of $\Phi : \wp(\mathbb{N}^2) \to \wp(\mathbb{N}^2)$ defined by:

$$\Phi(M) = \{(2,1), (2,2)\} \cup \{(\lfloor xy/2 \rfloor, \; x+y)\}_{(x,y) \in M}$$

This least fixed point cannot be finitely represented in any usual lattice used for modeling
sets of integer pairs, such as the linear inequalities lattice of Cousot and Halbwachs [3] for
instance. But one can very easily define a safe approximation $\Phi^*$ of $\Phi$ by:

$$\Phi^*(P) = \{([2,2],[1,2])\} \cup \{\varphi^*(X,Y)\}_{(X,Y) \in P}$$

where:

$$\varphi^*([i,s],[i',s']) = ([\lfloor ii'/2 \rfloor, \lfloor ss'/2 \rfloor], \; [i+i', s+s'])$$

Then, using the framework of section 3 with the widening operator:

V = (V/V/) = (V,-,V,-,V,-,V,---) 

one can finitely compute the following non-trivial approximation displayed in figure 5:

$$\{([2,2],[1,2]), \; ([1,2],[3,4]), \; ([1,4],[4,6]), \; ([2,12],[5,10]), \; ([5,\omega^+],[7,\omega^+])\}$$
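
The first pairs of this approximation can be recomputed by iterating $\varphi^*$ from the seed $([2,2],[1,2])$. The Python sketch below does only that: it applies no widening, and the encoding of a bi-dimensional interval as a pair of `(low, high)` tuples as well as the helper names are assumptions of the example.

```python
def phi_star(X, Y):
    """Abstract transformer on one box ([i, s], [i2, s2]) of non-negative integers."""
    (i, s), (i2, s2) = X, Y
    return ((i * i2 // 2, s * s2 // 2), (i + i2, s + s2))

def phi(x, y):
    """Concrete transformer on one pair of integers."""
    return (x * y // 2, x + y)

def inside(point, box):
    (x, y), ((i, s), (i2, s2)) = point, box
    return i <= x <= s and i2 <= y <= s2

# Successive abstract iterates of the seed box.
boxes = [((2, 2), (1, 2))]
for _ in range(4):
    boxes.append(phi_star(*boxes[-1]))
for box in boxes:
    print(box)

# Concrete iterates of the two seed pairs stay inside the matching boxes.
points = [(2, 1), (2, 2)]
for box in boxes:
    assert all(inside(p, box) for p in points)
    points = [phi(*p) for p in points]
```

The fifth box produced is $([5,60],[7,22])$; its lower bounds agree with the last pair $([5,\omega^+],[7,\omega^+])$ of the approximation, whose upper bounds have been generalized.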

5.2 Minimal function graphs 

We are now going to present an application of functional partitioning to interprocedural
abstract interpretation, which in fact originally motivated this work. Let us suppose that one
has a program function $\Phi : \mathbb{Z} \to \mathbb{Z}_\bot$ such as the Loop function introduced in section 2 or
McCarthy's 91 function defined by:

    fun Mc n = if (n > 100) then
                   n-10
               else
                   Mc(Mc(n + 11))


We wish to determine a safe approximation of the minimal function graph of $\Phi$ for a given
set of input data specifications. We are not going to formally describe a minimal function
graph semantics but rather give an intuition about the way finite representations of minimal
function graphs can be computed using functional partitioning. As hinted at the end of section
2, we shall abstract minimal function graphs using representations in $P(I(\mathbb{Z}), I(\mathbb{Z}))$, and input
data specifications using representations in $P_{nr}(I(\mathbb{Z}))$. So let us suppose that we have an initial
representation $I_0$ of the set of input data specifications. We can define the first representation
of the minimal function graph of $\Phi$ with respect to the input data specification $I_0$ by:

$$P_0 = \{(i_0, \bot)\}_{i_0 \in I_0}$$

The meaning of a representation $P$ is the one introduced in section 4.3, that is:

$$\Gamma(P)(n) = M_P[n,n]$$

where:

$$M_P(x) = \bigsqcup_{(i,v) \in P,\; x \wedge i \ne \bot} \; \bigsqcap \, \{v' : (x \wedge i) \le i' \le i\}_{(i',v') \in P}$$

Note that, contrary to what is proposed in Jones and Mycroft [10], we have not introduced
a special value "!" to denote non-termination. Therefore, at the end of the computation,
$\Gamma(P)(n) = \bot$ either means that $\Phi$ has never been called with $n$ as argument or that it has been
called and looped. Note that this is not too important since these two interpretations can be
easily distinguished by looking at the domain of the representation. The approximate minimal
function graph is therefore the limit of the increasing chain obtained by iterating $\Phi^*$ from $P_0$,
where $\Phi^*(P)$ is defined as follows:

1) For every $(i,v)$ in $P$, an updated value $v'$ of $v$ is computed by applying the definition
of $\Phi$ to the set of values denoted by $i$ and replacing the values of the recursive calls
$\Phi(i')$ by $M_P(i')$. The latter is the best approximation of $\Phi(i')$ that can be given using
the current approximation of $\Phi$.

2) Then $\Phi^*(P) = \{(i, v \nabla v')\}_{(i,v) \in P} \; \nabla \; \{(i', \bot)\}_{i' \in I'}$, where $I'$ is the set of new abstract
control points over which $\Phi$ has been called in step 1.

In other words, we compute an updated approximation $v'$ of the value of $\Phi$ over each
abstract control point $i$ in the domain of the representation $P$, and take into account the fact
that recursive calls have generated new abstract control points $i'$ by inserting these intervals
into the representation.

The insertion of the updated value $v'$ can be done for instance in a fairly simple way by
using the usual widening operator $\nabla_I$ over $I(\mathbb{Z})$ defined in section 2, and replacing $(i,v)$ in the
initial representation by $(i, v \nabla_I v')$, in order to make sure that the increasing chain of abstract
values $(v, v \nabla_I v', \ldots)$ will be eventually stable. Of course, at the beginning of the iteration
sequence, it is also safe to replace $(i,v)$ by $(i, v \sqcup v')$.

The insertion of the new abstract control points $i'$ into the representation is more subtle, and
uses the elementary widening operator $\nabla$ defined in section 4.3. Obviously, it is generally
unsafe to add directly $(i', \bot)$ into the representation since, as discussed in section 4.3, $i'$
might "mask" one of the intervals $i$ in the domain of the representation and thus invalidate its
meaning. The smallest pair that can be safely inserted is therefore $(i', S_P(i'))$. But abstract
control points themselves need to be generalized in order to enforce a finite computation, and
at some point of the iteration sequence, we will have to replace the interval $i'$ by a greater one
$i''$, e.g., the maximum element $\top$. This can be formalized by introducing three elementary
widening operators defined as follows.

($\nabla_a$) Add the pair $(i', S_P(i'))$. This is the most precise, join-like, widening operator.

($\nabla_b$) When it is safe not to add $i'$, i.e., when the region covered by $i'$ is already covered by
the domain of $P$, then do nothing, otherwise generalize by adding $(i'', S_P(i''))$, where
$i'' \ge i'$. A good choice can be for instance $i'' = (\bigvee A(P)) \vee i'$, i.e., the smallest interval
representing all the values over which $\Phi$ has been computed so far, in which case
$S_P(i'') = \bot$.

($\nabla_c$) Finally, to avoid adding an infinite number of abstract control points, one can use the
widening operator over the intervals and add $((\bigvee A(P)) \nabla_I i', \bot)$.
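
The three operators can be sketched in Python as follows. Several points are assumptions of this example rather than definitions taken from the text: the usual interval widening that sends unstable bounds to infinity stands in for the operator of section 2, the covering test of the second operator is approximated by the existence of a point above $i'$, a point already in the domain is not inserted twice, and the call arguments in the usage part are hypothetical.

```python
import math
from functools import reduce

BOT = None                    # bottom: the empty interval (assumed encoding)
TOP = (-math.inf, math.inf)

def leq(x, y):
    return x is BOT or (y is not BOT and y[0] <= x[0] and x[1] <= y[1])

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def V(P, u, v):
    return reduce(meet, (b for a, b in P if leq(u, a) and leq(a, v)), TOP)

def S(P, x):
    """Smallest safe value at x."""
    out = BOT
    if x is not BOT:
        for a, _ in P:
            if leq(x, a):
                out = join(out, V(P, x, a))
    return out

def widen_interval(x, y):
    """Assumed interval widening: a bound that is not stable jumps to infinity."""
    if x is BOT:
        return y
    if y is BOT:
        return x
    lo = x[0] if x[0] <= y[0] else -math.inf
    hi = x[1] if x[1] >= y[1] else math.inf
    return (lo, hi)

def hull(P):
    """Join of all the abstract control points in the domain of P."""
    return reduce(join, (a for a, _ in P), BOT)

def add_point(P, i, v):
    return P if any(a == i for a, _ in P) else P + [(i, v)]

def nabla_a(P, i1):
    return add_point(P, i1, S(P, i1))

def nabla_b(P, i1):
    if any(leq(i1, a) for a, _ in P):      # approximated covering test
        return P
    i2 = join(hull(P), i1)
    return add_point(P, i2, S(P, i2))

def nabla_c(P, i1):
    return add_point(P, widen_interval(hull(P), i1), BOT)

P = [((0, 0), BOT)]
P = nabla_a(P, (1, 1))       # precise step
P = nabla_c(P, (2, 2))       # generalization: [0, 1] widened by [2, 2]
P = nabla_b(P, (1, 100))     # already covered by [0, +inf]: nothing to do
P = nabla_a(P, (1, 100))     # precise step inside the delimited domain
print([a for a, _ in P])
```

With these hypothetical calls the final domain consists of $[0,0]$, $[1,1]$, $[0,\omega^+]$ and $[1,100]$, $\omega^+$ being modelled by `math.inf`.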

Of course, the choice of the sequence of elementary widening operators is essential. The first
elementary widening operator $\nabla_a$ will generally be used at the beginning of the computation,
and $\nabla_c$ will systematically be used at the end. Moreover, it is often useful, after having
generalized using $\nabla_c$, to make a few more precise steps using $\nabla_a$ or $\nabla_b$. The motivation behind
this choice is that once the domain of the minimal function graph has been delimited, a few
more precise steps are generally needed to determine the abstract control points that are useful
to precisely describe this graph and allow these intervals to "propagate" along recursive calls.

Finally, note that the insertion of the updated abstract values $v'$ and the insertion of the new
abstract control points $i'$ can be freely mixed in practice, and newly generated control points
can be added on the fly to the representation without problem.

The widening operator that we have described turns the functional partitioning framework 
into a tractable framework. So for instance, using the widening operator $(\nabla_a \, \nabla_c^{\omega})$, one can
automatically compute, after 4 iterations, the following representation of the minimal function
graph of Loop for the input data specification $\{[0,0]\}$:

$$\{([0,0],[100,100]), \; ([0,\omega^+],[100,\omega^+]), \; ([1,1],[100,100]), \; ([1,100],[100,100])\}$$

which has the following meaning:

    n                  Loop(n)
    0 ≤ n ≤ 100        [100, 100]
    100 < n            [100, ω⁺]


This result is interesting in that it shows that the exact information: Loop $[0,100] = [100,100]$
has been obtained, as opposed to section 2, and this has been achieved without the help of a
narrowing operator. However, contrary to the result of section 2, the approximate minimal
function graph seems to indicate that the computation of Loop(0) might require computing
Loop for values greater than 100, but starting this time from the input data specification
$\{[0,100]\}$, and using the "brute force" widening operator $(\nabla_c^{\omega})$ we can compute the following
representation:

$$\{([0,100],[100,100])\}$$

which invalidates this interpretation. Similarly, using the widening operator ^'^) one 

can compute, after 4 iterations, the following representation of the minimal function graph of 
Mc for the input specification {[0, 50]}: 

$$\{([0,50],[91,91]), \; ([0,\omega^+ - 10],[91,\omega^+]), \; ([11,111],[91,101]), \; ([11,61],[91,91]),$$
$$([22,72],[91,91]), \; ([22,111],[91,101]), \; ([91,101],[91,91])\}$$

This representation has the following meaning:

    n                  Mc(n)
    n < 0              ⊥
    0 ≤ n ≤ 72         [91, 91]
    73 ≤ n ≤ 90        [91, 101]
    91 ≤ n ≤ 101       [91, 91]
    102 ≤ n ≤ 111      [91, 101]
    112 ≤ n            [91, ω⁺ - 10]


which is a good and safe approximation of the exact meaning of Mc, i.e.:

$$\mathrm{Mc}(n) = \begin{cases} n - 10 & \text{if } n > 101 \\ 91 & \text{otherwise} \end{cases}$$
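
The table and the safety claim can be cross-checked with a short Python sketch that evaluates the meaning of the seven-pair representation and compares it with a direct transcription of Mc. The interval helpers and their names are assumptions of the example, and $\omega^+$ is modelled by `math.inf`, so that $\omega^+ - 10$ is also infinite in this model.

```python
import math
from functools import reduce

BOT = None                    # bottom: the empty interval (assumed encoding)
TOP = (-math.inf, math.inf)
INF = math.inf                # stands for omega+

def leq(x, y):
    return x is BOT or (y is not BOT and y[0] <= x[0] and x[1] <= y[1])

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def V(P, u, v):
    return reduce(meet, (b for a, b in P if leq(u, a) and leq(a, v)), TOP)

def meaning(P, n):
    """Gamma(P)(n) = M_P([n, n])."""
    out = BOT
    for a, _ in P:
        ax = meet(a, (n, n))
        if ax is not BOT:
            out = join(out, V(P, ax, a))
    return out

def mc(n):
    """Direct transcription of the 91 function."""
    return n - 10 if n > 100 else mc(mc(n + 11))

P = [((0, 50), (91, 91)), ((0, INF), (91, INF)), ((11, 111), (91, 101)),
     ((11, 61), (91, 91)), ((22, 72), (91, 91)), ((22, 111), (91, 101)),
     ((91, 101), (91, 91))]

for n in (-1, 0, 72, 73, 90, 91, 101, 102, 111, 112):
    print(n, meaning(P, n))

# Safety: the concrete result always lies in the abstract value.
for n in range(0, 300):
    lo, hi = meaning(P, n)
    assert lo <= mc(n) <= hi
```

The printed values follow the rows of the table, and the final loop checks on a finite range of inputs that each concrete result is contained in the corresponding abstract value.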

It is interesting to compare this result to the one obtained in Bourdoncle [1] using a method
based on static partitioning. In this method, the representation of McCarthy's 91 function
would consist of three interval pairs, each pair being associated with a syntactically different
call to Mc, that is, the main call to Mc and the two recursive calls. This formally corresponds to
having three mutually recursive functions Mc1, Mc2 and Mc3 with the following definition:

    if (n > 100) then
        n-10
    else
        Mc3(Mc2(n + 11))

and describing each of these functions by a pair of intervals representing all of the function's
inputs and all of the function's outputs. The result obtained is the following:

    Mc1 : ([0, 50], [91, ω⁺ - 20])
    Mc2 : ([11, ω⁺], [91, ω⁺ - 10])
    Mc3 : ([91, ω⁺ - 10], [91, ω⁺ - 20])

This quite mediocre result can be explained by noting that the induction property:

$$\forall n \in [91,101] : \; \mathrm{Mc}(n) = 91$$

has not been inferred by the framework because the number of interval pairs was fixed in
advance. This phenomenon can be worked around by using an ad-hoc input data specification,
namely $\{[0,100]\}$, which gives the following, optimum, result:

    Mc1 : ([0, 100], [91, 91])
    Mc2 : ([11, 111], [91, 101])
    Mc3 : ([91, 101], [91, 91])

However, this "trick" is not necessary when using the functional partitioning framework, since
this framework infers the interesting program properties by itself, and automatically determines
the number of interval pairs needed to describe the program invariant. However, it is worth
mentioning that the widening operator has a major impact on the result's quality, and for
instance, the "brute force" widening operator would only compute, after 2 iterations, the
following, mediocre but concise, representation:

$$\{([0,50],[91,\omega^+ - 10]), \; ([0,\omega^+],[91,\omega^+ - 10])\}$$

with the obvious meaning:

$$\forall n \in [0,\omega^+] : \; \mathrm{Mc}(n) \in [91, \omega^+ - 10]$$

This example shows that the data-oriented approach of dynamic partitioning is much more 
versatile than the syntax-oriented approach of static partitioning, and generally gives better 
results. But on the other hand, static partitioning guarantees the size of the least fixed point's 
representation, and can lead to faster analyses. Finally, note that the two approaches can 
be easily mixed. For example, using the widening operator $(\nabla_a \, \nabla_c^{\omega})$ and the input data
specification $(\{[0,50]\}, \emptyset, \emptyset)$, one can compute, after 5 iterations, the following representations:

    Mc1 : {([0, 50], [91, 91])}
    Mc2 : {([11, 61], [91, 91]), ([11, ω⁺], [91, ω⁺ - 10]),
           ([22, 72], [91, 91]), ([22, 111], [91, 101])}
    Mc3 : {([91, 101], [91, 91])}

which have the following, coalesced meaning, obtained by intersecting their individual
meanings:

    n                  Mc(n)
    n < 0              ⊥
    0 ≤ n ≤ 72         [91, 91]
    73 ≤ n ≤ 90        [91, 101]
    91 ≤ n ≤ 101       [91, 91]
    102 ≤ n ≤ 111      [91, 101]
    112 ≤ n            [91, ω⁺ - 10]


6 Conclusion 

We have presented a technique that enables rich abstract interpretation frameworks to 
be built from simpler ones even in cases when one has no indication about what such 
frameworks should look like. We believe in particular that functional partitioning is of great 
interest to interprocedural abstract interpretation for it incrementally builds finite, non-trivial 
representations of minimal function graphs and monotonic functions. More generally, the 
representation framework can be used every time there is no canonical representation of 
abstract program properties and the equivalence test over these properties is intractable or very 
costly. 

We have shown how widening operators can be built in dynamic partitioning frameworks, 
and exemplified their behavior over a set of examples. However, this paper has not addressed 
a number of interesting problems such as the effective design of narrowing operators, 
the combination of forward and backward analyses, and the generalization of functional 
partitioning to higher order functions.